-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(insights): Attribute async queries to users #21019
Conversation
51e25b8
to
92bfec6
Compare
posthog/tasks/tasks.py
Outdated
@shared_task(ignore_result=True, queue=CeleryQueue.ANALYTICS_QUERIES.value) | ||
def process_query_task( | ||
team_id: str, query_id: str, query_json: Any, limit_context: Any = None, refresh_requested: bool = False | ||
team_id: str, | ||
user_id: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be tricky in terms of deployment, in that the number of arguments to the task changes. AFAIK this will mean Celery raising a TypeError
when processing a task queued pre-change on a post-change worker (Celery docs).
Is the rollout significant enough for this to disrupt anyone though?
Looking just at the async queries feature flag, more care doesn't seem worth the effort. Unless there are places that rely on async queries other than /query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking care of this.
I commented with an idea to get around a task signature change.
Otherwise, I'd say it's also used so little that just deploying it like this would also be alright.
@@ -136,14 +139,23 @@ def enqueue_process_query_task( | |||
query_status = QueryStatus(id=query_id, team_id=team_id, start_time=datetime.datetime.now(datetime.timezone.utc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative idea: We put the user ID in the query status here, and thereby have it in the task, without the need to pass it to the task as args. As further advantages, it's then also persisted and returned via API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that could work too. Though I don't want to spend time reworking this right now to this degree, if the current solution works. 😄 It doesn't seem super useful right now to return the user ID via the API. I guess if that changes, it might be fine to adjust this
Problem
Async queries couldn't be viewed in the debug modal, due to not being attributed to users.
Changes
We'll now pass
user_id
to the task, and tag CH queries appropriately.Does this work well for both Cloud and self-hosted?
Yes.
How did you test this code?
Updated tests, especially
test_client_strips_comments_from_request
.